Progenitor V3.3 LLaMa 70B
This project aims to build a stronger language model by fusing multiple pre-trained models at the 70B scale. With the Llama 3.3 Instruct model as the base, the constituent models are merged using the Linear DELLA fusion method.
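The core idea of Linear DELLA can be sketched on toy arrays: each fine-tuned model's delta from the base is pruned stochastically, with larger-magnitude entries given a higher keep probability, kept entries rescaled by the inverse keep probability, and the pruned deltas combined as a weighted linear sum on top of the base. The function below is an illustrative simplification, not the actual merge implementation used for this model; the function name, parameters, and exact probability schedule are assumptions for demonstration.

```python
import numpy as np

def della_linear_merge(base, models, weights, density=0.5, epsilon=0.1, seed=0):
    """Toy sketch of Linear DELLA merging on flat weight arrays.

    For each fine-tuned model, the delta (task vector) from the base is
    pruned stochastically: larger-magnitude entries get a higher keep
    probability, centered on `density` with spread `epsilon`. Kept
    entries are rescaled by 1/p_keep, and the pruned deltas are summed
    with the given weights and added back to the base weights.
    """
    rng = np.random.default_rng(seed)
    merged = base.astype(float).copy()
    for w, model in zip(weights, models):
        delta = model - base
        n = delta.size
        # Rank entries by magnitude: rank 0 = smallest, n-1 = largest.
        ranks = np.argsort(np.argsort(np.abs(delta)))
        if n > 1:
            # Keep probabilities spread over [density-eps, density+eps],
            # assigning higher keep probability to larger magnitudes.
            p_keep = density - epsilon + 2.0 * epsilon * ranks / (n - 1)
        else:
            p_keep = np.full(n, density)
        p_keep = np.clip(p_keep, 1e-6, 1.0)
        mask = rng.random(n) < p_keep
        # Rescale surviving entries so the expected delta is unchanged.
        pruned = np.where(mask, delta / p_keep, 0.0)
        merged += w * pruned
    return merged
```

With `density=1.0` and `epsilon=0.0` no entries are dropped, so the result reduces to a plain linear merge of the task vectors; lower densities trade exactness for reduced interference between the merged models.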
Large Language Model
Transformers